11 research outputs found

New Frameworks for Offline and Streaming Coreset Constructions

Author: Braverman Vladimir
Feldman Dan
Lang Harry
Statman Adiel
Zhou Samson
Publication venue
Publication date: 18/09/2022
Field of study

A coreset for a set of points is a small subset of weighted points that approximately preserves important properties of the original set. Specifically, if $P$ is a set of points, $Q$ is a set of queries, and $f:P\times Q\to\mathbb{R}$ is a cost function, then a set $S\subseteq P$ with weights $w:P\to[0,\infty)$ is an $\epsilon$-coreset for some parameter $\epsilon>0$ if $\sum_{s\in S}w(s)f(s,q)$ is a $(1+\epsilon)$ multiplicative approximation to $\sum_{p\in P}f(p,q)$ for all $q\in Q$. Coresets are used to solve fundamental problems in machine learning under various big data models of computation. Many of the coresets suggested in the recent decade used, or could have used, a general framework for constructing coresets whose size depends quadratically on what is known as the total sensitivity $t$. In this paper we improve this bound from $O(t^2)$ to $O(t\log t)$. Thus our results imply more space-efficient solutions to a number of problems, including projective clustering, $k$-line clustering, and subspace approximation. Moreover, we generalize the notion of sensitivity sampling for sup-sampling that supports non-multiplicative approximations, negative cost functions and more. The main technical result is a generic reduction to the sample complexity of learning a class of functions with bounded VC dimension. We show that obtaining a $(\nu,\alpha)$-sample for this class of functions with appropriate parameters $\nu$ and $\alpha$ suffices to achieve space-efficient $\epsilon$-coresets. Our result implies more efficient coreset constructions for a number of interesting problems in machine learning; we show applications to $k$-median/$k$-means, $k$-line clustering, $j$-subspace approximation, and the integer $(j,k)$-projective clustering problem.
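As a minimal sketch of the definition in the abstract above (not the paper's construction), the following Python snippet measures how far a weighted subset is from the full cost over a finite set of queries. The names `coreset_error` and `sq_dist`, the toy data, and the reading of "$(1+\epsilon)$ multiplicative approximation" as a relative error of at most $\epsilon$ are all assumptions made for illustration.

```python
import numpy as np

def coreset_error(P, S, w, Q, f):
    """Largest relative error of the weighted cost on S versus the
    full cost on P, taken over the finite query set Q."""
    worst = 0.0
    for q in Q:
        full = sum(f(p, q) for p in P)
        approx = sum(wi * f(s, q) for s, wi in zip(S, w))
        if full > 0:  # relative error is undefined when the full cost is 0
            worst = max(worst, abs(approx - full) / full)
    return worst

def sq_dist(p, q):
    """Squared Euclidean distance, used here as a hypothetical cost f(p, q)."""
    return float(np.sum((np.asarray(p) - np.asarray(q)) ** 2))

# Toy data (assumption): every point appears twice in P, so the distinct
# points with weight 2 reproduce the full cost exactly.
base = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
P = np.vstack([base, base])
Q = np.array([[0.5, 0.5], [3.0, -1.0], [-2.0, 4.0]])

err_weighted = coreset_error(P, base, np.full(len(base), 2.0), Q, sq_dist)
err_unweighted = coreset_error(P, base, np.ones(len(base)), Q, sq_dist)
```

Under this reading, `S` with weights `w` is an $\epsilon$-coreset on the sampled queries whenever the returned value is at most $\epsilon$. The unweighted variant halves every cost, which shows why the weights matter.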

arXiv.org e-Print Archive

Synaptic Size Dynamics as an Effectively Stochastic Process

Author: Adiel Statman (638527)
Amir Minerbi (262841)
Maya Kaufman (149247)
Naama Brenner (28701)
Noam E. Ziv (147889)
Publication venue
Publication date: 01/10/2014
Field of study

Long-term, repeated measurements of individual synaptic properties have revealed that synapses can undergo significant directed and spontaneous changes over time scales of minutes to weeks. These changes are presumably driven by a large number of activity-dependent and independent molecular processes, yet how these processes integrate to determine the totality of synaptic size remains unknown. Here we propose, as an alternative to detailed, mechanistic descriptions, a statistical approach to synaptic size dynamics. The basic premise of this approach is that the integrated outcome of the myriad of processes that drive synaptic size dynamics is effectively described as a combination of multiplicative and additive processes, both of which are stochastic and taken from distributions parametrically affected by physiological signals. We show that this seemingly simple model, known in probability theory as the Kesten process, can generate rich dynamics which are qualitatively similar to the dynamics of individual glutamatergic synapses recorded in long-term time-lapse experiments in ex-vivo cortical networks. Moreover, we show that this stochastic model, which is insensitive to many of its underlying details, quantitatively captures the distributions of synaptic sizes measured in these experiments, the long-term stability of such distributions and their scaling in response to pharmacological manipulations. Finally, we show that the average kinetics of new postsynaptic density formation measured in such experiments is also faithfully captured by the same model. The model thus provides a useful framework for characterizing synapse size dynamics at steady state, during initial formation of such steady states, and during their convergence to new steady states following perturbations. These findings show the strength of a simple low-dimensional statistical model to quantitatively describe synapse size dynamics as the integrated result of many underlying complex processes.

Directory of Open Access Journals

PubMed Central

FigShare

Properties of the Kesten process in estimated parameter regime.

Author: Adiel Statman (638527)
Amir Minerbi (262841)
Maya Kaufman (149247)
Naama Brenner (28701)
Noam E. Ziv (147889)

(A) Simulated synaptic trajectories of 14 out of 1075 synapses, evolved for 160 hours at 30 min intervals. Synapses were sorted according to initial size and then every 76th trajectory was selected for display (compare with Fig. 1D). The Kesten process parameters used here were based on the estimate shown in Fig. 4 (〈ε〉 = 0.9923±0.05; 〈η〉 = 0.0077±0.03) and values were obtained from Gaussian distributions with these parameters. The initial data set (1087 synapses) was identical to that shown in Figs. 2A and 4; 12 synapses were ‘lost’ during the simulation (i.e. their values reduced to 0) and were excluded from subsequent analysis. (B) Synaptic distributions along time, starting from a measured distribution (thick black line) and applying the time evolution of the Kesten process to this initial population. Four subsequent time points are plotted as indicated. Inset shows the same distributions on a semi-logarithmic scale. (C,D) Examples of k-times iterated mappings corresponding to 24 and 48 time-steps (compare with Fig. 4C,D). (E) Slope of k-times iterated mappings as a function of k in simulated trajectories (circles) and in a theoretical prediction based on Eq. (3) (red solid line, red equation).
(F) Scatter plot of changes in synapse size as a function of initial size for simulated trajectories for the period covering the first 24 hours of the simulation. Note the strong resemblance to the experimental measurements of Fig. 2A.
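The simulation described in this caption can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' code. It assumes the update x(t+1) = ε·x(t) + η with fresh Gaussian draws per synapse and per step, and reads the "±" values as standard deviations. It replaces the measured initial sizes, which are not part of this listing, with hypothetical log-normal values, and treats a synapse as lost once its size reaches 0 or below. The function name `simulate_kesten` is likewise hypothetical.

```python
import numpy as np

def simulate_kesten(x0, n_steps, eps_mean, eps_sd, eta_mean, eta_sd, rng):
    """Evolve sizes by x <- eps * x + eta with Gaussian eps and eta.
    A synapse whose size drops to 0 or below is clamped to 0 and stays there."""
    x = np.asarray(x0, dtype=float).copy()
    traj = np.empty((n_steps + 1, x.size))
    traj[0] = x
    alive = x > 0
    for t in range(1, n_steps + 1):
        eps = rng.normal(eps_mean, eps_sd, size=x.size)
        eta = rng.normal(eta_mean, eta_sd, size=x.size)
        x = np.where(alive, eps * x + eta, 0.0)
        alive &= x > 0          # once lost, a synapse is never revived
        x[~alive] = 0.0
        traj[t] = x
    return traj

rng = np.random.default_rng(0)
# Hypothetical stand-in for the 1087 measured initial sizes (assumption).
x0 = rng.lognormal(mean=0.0, sigma=0.5, size=1087)
x0 /= x0.mean()

# 160 hours at 30 min intervals gives 320 steps.
traj = simulate_kesten(x0, 320, 0.9923, 0.05, 0.0077, 0.03, rng)
n_lost = int((traj[-1] == 0).sum())

# Sort by initial size and pick every 76th trajectory for display.
order = np.argsort(traj[0])
shown = traj[:, order[::76]]
```

The number of lost synapses depends on the random seed and on the stand-in initial sizes, so it need not match the 12 reported in the caption.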

FigShare

nar

Author: Adiel Statman (638527)
Amir Minerbi (262841)
Maya Kaufman (149247)
Naama Brenner (28701)
Noam E. Ziv (147889)

ne'er. We didn't take nar copper from him. Yes. DNE-cit WK 63. Used I and Sup.

Memorial University Newfoundland Digital Archive Initiative

FigShare

Estimating Kesten parameters in experimental data.

Author: Adiel Statman (638527)
Amir Minerbi (262841)
Maya Kaufman (149247)
Naama Brenner (28701)
Noam E. Ziv (147889)

An estimate of the parameter 〈ε〉 can be obtained from k-times iterated mappings of the data as explained in text. These mappings are shown for 1, 8, 24 and 48 time-steps, corresponding to 0.5, 4, 12 and 24 hours respectively (A–D); from each such mapping the slope of the linear regression (solid black line) is extracted. (E) The logarithmic values of these slopes (circles) plotted as a function of iteration number and fit by linear regression (solid black line) to obtain an estimate of 〈ε〉. (F) The measured slopes (circles) with the predicted slope values (red line) over an extended time scale.
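The estimation procedure in this caption can be illustrated on synthetic data. The sketch below is hedged: the experimental recordings are not part of this listing, so it generates trajectories from an assumed Kesten update x(t+1) = ε·x(t) + η with Gaussian draws, pools all pairs (x(t), x(t+k)) for each k, and relies on the slope of the k-step mapping decaying as the k-th power of the mean multiplicative factor. The pooling choice, the unclamped update, and the name `mapping_slope` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_syn, n_steps = 1000, 96
eps_mean, eta_mean = 0.9923, 0.0077   # values quoted in the listing

# Synthetic stand-in for measured sizes (assumption), normalized to mean 1.
x = rng.lognormal(0.0, 0.5, n_syn)
x /= x.mean()
traj = np.empty((n_steps + 1, n_syn))
traj[0] = x
for t in range(1, n_steps + 1):
    eps = rng.normal(eps_mean, 0.05, n_syn)
    eta = rng.normal(eta_mean, 0.03, n_syn)
    traj[t] = eps * traj[t - 1] + eta   # unclamped, for simplicity

def mapping_slope(traj, k):
    """Slope of the linear regression of x(t+k) on x(t), pooled over t."""
    x_now = traj[:-k].ravel()
    x_later = traj[k:].ravel()
    slope, _intercept = np.polyfit(x_now, x_later, 1)
    return slope

ks = np.array([1, 8, 24, 48])
slopes = np.array([mapping_slope(traj, k) for k in ks])

# log(slope) is roughly k * log(mean eps), so a linear fit recovers it.
log_fit_slope, _ = np.polyfit(ks, np.log(slopes), 1)
eps_estimate = float(np.exp(log_fit_slope))
```

With enough synapses, `eps_estimate` should land close to the value used to generate the data. The precision depends on the sample size and on the spread of the noise terms.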

FigShare

Distribution rescaling with individual rank-order shuffling in the Kesten process.

Author: Adiel Statman (638527)
Amir Minerbi (262841)
Maya Kaufman (149247)
Naama Brenner (28701)
Noam E. Ziv (147889)

The Kesten process provides a simple mechanism for population distribution rescaling without individual multiplication by a constant factor. Simulations were performed for 127 synapses (initial values taken from the synapses of Fig. 7). The synapses were first evolved for 24 hours (48 time points) with a Kesten process that preserved the original distribution. At this point 〈ε〉 was slightly increased (from 0.992 to 0.995), and the trajectories were evolved for another 24 hours with the new parameters. (A) Distributions before (blue) and after (red) changing 〈ε〉. (B) Same distributions shown in (A) after scaling. (C) Changes in the fluorescence of individual synapses (ΔF) during the first 24 hours after changing 〈ε〉 (averages and standard deviations of binned data). The green line represents the expected relationship between ΔF and F had sizes of individual synapses scaled through multiplication by 1.14 (the ratio of mean synaptic size before and after changing 〈ε〉). (D) Scaling without preserving rank order. Synapses were sorted according to their size before changing 〈ε〉 and plotted according to their original sizes (blue dots). The ‘sizes’ of the same synapses 24 hours after changing 〈ε〉 are shown as red dots. As in the experiments of Figs. 7 and 8, rank order is not preserved. The expected synaptic ‘sizes’, had scaling occurred multiplicatively, are shown as green dots.

FigShare

Changes in the fluorescence of individual synapses as a function of their initial fluorescence.

Author: Adiel Statman (638527)
Amir Minerbi (262841)
Maya Kaufman (149247)
Naama Brenner (28701)
Noam E. Ziv (147889)

Each dot represents one synapse. ΔF represents the change in fluorescence after a given time interval. Data were normalized by dividing the fluorescence of each synapse by the average fluorescence of all synapses at time t = 0 to allow pooling of data from multiple neurons irrespective of some variability in neuron-to-neuron expression levels. Solid lines are linear fits; vertical dashed lines highlight the average synaptic size (equal to 1 after normalization). All data were obtained under baseline conditions from unperturbed networks. (A) Rat cortical neurons expressing PSD-95:EGFP; 1087 synapses from 10 neurons in 5 separate experiments. Images were collected at 30 min intervals; ΔF was measured after a 24 hour interval (see ref [20] for further details). (B) Mouse cortical neurons expressing PSD-95:mTurquoise; 554 synapses from 8 neurons in 6 separate experiments. Images were collected at 25 min intervals; ΔF was measured after a 15 hour interval (see ref [63] for further details). (C) Mouse cortical neurons expressing munc13-1:EYFP; 554 synapses from 8 neurons in 6 separate experiments. Imaging was performed as in B (see ref [63] for further details). (D) Rat cortical neurons expressing mTurquoise2:Gephyrin; 749 synapses from 27 neurons in 4 experiments. Images were collected at 60 min intervals; ΔF was measured after a 24 hour interval (Anna Rubinski and Noam E. Ziv, unpublished data).

FigShare

Kinetics of formation of new postsynaptic densities.

Author: Adiel Statman (638527)
Amir Minerbi (262841)
Maya Kaufman (149247)
Naama Brenner (28701)
Noam E. Ziv (147889)

A,B) The formation of a new PSD. Left panel: low magnification image of a dendrite 68 hours after the beginning of a time lapse session (started at 21 days in vitro). Right panels: gradual accumulation of PSD-95:EGFP at a new site (blue arrowhead). Bar: 10 µm. C) Time course of PSD-95:EGFP accumulation at the new site shown in A. The blue dots indicate the time-points of the images shown in B. D) Mean time course of new PSD formation in mature (>21 days in vitro) networks (average ± SEM). Data, pooled from 4 neurons, were aligned to the first time point at which a new PSD was observed. The fluorescence of each new synapse was normalized by subtracting the fluorescence value measured at its future location before a PSD was first detectable, and then dividing by the background-corrected mean fluorescence of the preexisting PSDs of that neuron. The number of new PSDs used to calculate the data points is shown as an orange line. E) Two simulated trajectories of new synapses, seeded with an initial value of 0.05 and evolved as a Kesten process with parameters 〈ε〉 = 0.962±0.06 and 〈η〉 = 0.038±0.03 (Gaussian distributions). The resulting trajectories were normalized as the experimental data shown in D. F) Mean time course of synapse formation calculated analytically by equation (5) (green) and averaged over 200 simulated Kesten trajectories (red, average ± SEM) evolved and normalized as described in E. Open circles represent the experimentally measured data shown in D. G) Synapse formation in developing networks: mean time course of new PSD formation in developing networks (10–13 days in vitro; average ± SEM). Data, pooled from 3 neurons, were normalized as in D. The number of PSDs used to calculate the data points is shown as an orange line.
H) Mean time course of synapse formation in developing networks calculated analytically based on equation (5) (green) and averaged over 79 simulated Kesten trajectories (red, average ± SEM). The parameters used for these simulations and calculations were 〈ε〉 = 0.74±0.06 and 〈η〉 = 0.26±0.03 (Gaussian distributions; 〈η〉 was constrained as explained in the main text). Note that these reflect values for 10 minute steps (equivalent to 〈ε〉 = 0.405 for half hour steps). Open circles represent the experimentally measured data shown in G.
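The conversion between 10 minute and half hour steps quoted in panel H can be checked with a short worked equation. It assumes that the multiplicative factors drawn at successive steps are independent of one another, so that the mean factor over three consecutive 10 minute steps is the product of the three single-step means. The subscripts below are introduced here for illustration only.

```latex
\langle \varepsilon \rangle_{30\,\mathrm{min}}
  = \langle \varepsilon_1 \varepsilon_2 \varepsilon_3 \rangle
  = \langle \varepsilon \rangle_{10\,\mathrm{min}}^{3}
  = 0.74^{3}
  \approx 0.405
```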

FigShare

Invariance of Kesten limiting distribution shape to different ε- and η- distributions.

Author: Adiel Statman (638527)
Amir Minerbi (262841)
Maya Kaufman (149247)
Naama Brenner (28701)
Noam E. Ziv (147889)

(A) Simulated limiting distributions of Kesten processes with the three different ε-distributions shown in inset, all belonging to the same μ-class 6, that is, 〈ε^6〉 = 1. The distribution of η was held fixed. The same three distributions after scaling are shown on the right. (B) Simulated limiting distributions of Kesten processes with the three different η-distributions shown in the inset. The distribution of ε was held fixed. The same three distributions after scaling are shown on the right.

FigShare